Written
Corpus,
Language Type:
Monolingual
Languages:
Egyptian Arabic
Availability:
Will be Available with the publication of the paper
License:
Size:
7.7 million sentences
Production Status:
Newly created-finished
Use:
Language Modelling
- Paper title: Arabizi Language Models for Sentiment Analysis
- Paper track: Long paper/
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Gaétan Baert | LAD | /N |
Documentation:
None
Language Type:
Multilingual
Languages:
Algerian Arabic Egyptian Arabic Gulf Arabic Mesopotamian Arabic North Levantine Arabic
Availability:
From Owner
License:
MIT
Size:
~2 million
Production Status:
Newly created-in progress
Use:
Language Identification
- Paper title: A Multi-Dialect, Multi-Genre Corpus of Informal Written Arabic
- Paper track: Written
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Ryan Cotterell | Johns Hopkins University | US |
| Author 2 | Chris Callison-Burch | University of Pennsylvania | US |
| Main Contact | Ryan Cotterell | Johns Hopkins University | None |
Documentation:
<Not Specified>
Language Type:
Multilingual
Languages:
Egyptian Arabic English German Spanish
Availability:
Freely Available
License:
TBD
Size:
643 sentences
Production Status:
Newly created-finished
Use:
Parsing and Tagging
- Paper title: Incrementally Learning a Dependency Parser to Support Language Documentation in Field Linguistics
- Paper track: Under-resourced Languages
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Morgan Ulinski | Columbia University | US |
| Author 2 | Julia Hirschberg | Columbia University | US |
| Author 3 | Owen Rambow | Columbia University | US |
| Main Contact | Morgan Ulinski | Columbia University | None |
Documentation:
<Not Specified>
Language Type:
Multilingual
Languages:
Egyptian Arabic English North Levantine Arabic Standard Arabic
Availability:
From Owner
License:
<Not Specified>
Size:
236913 words
Production Status:
Newly created-in progress
Use:
Emotion Recognition/Generation
- Paper title: SANA: A Large Scale Multi-Genre, Multi-Dialect Lexicon for Arabic Subjectivity and Sentiment Analysis
- Paper track: Evaluation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Muhammad Abdul-Mageed | Indiana University | CA |
| Author 2 | Mona Diab | GWU | US |
| Main Contact | Muhammad Abdul-Mageed | The University of British Columbia | None |
Documentation:
<Not Specified>
Written
Corpus,
Language Type:
Multilingual
Languages:
Arabic Egyptian Arabic Gulf Arabic North Levantine Arabic Tunisian Arabic
Availability:
Freely Available
License:
Gnu
Size:
355069 words
Production Status:
Existing-updated
Use:
Document Classification, Text categorisation
- Paper title: Arabic Dialect Identification in the Context of Bivalency and Code-Switching
- Paper track: Written
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Mahmoud El-Haj | Lancaster University | GB |
| Author 2 | Paul Rayson | Lancaster University | GB |
| Author 3 | Mariam Aboelezz | British Library | GB |
| Main Contact | Mahmoud El-Haj | Lancaster University | None |
Documentation:
<Not Specified>
Language Type:
Multilingual
Languages:
Egyptian Arabic
Availability:
From Data Center(s)
License:
<Not Specified>
Size:
<Not Specified>
Production Status:
Newly created-in progress
Use:
Morphological Analysis
- Paper title: Developing an Egyptian Arabic Treebank: Impact of Dialectal Morphology on Annotation and Tool Development
- Paper track: Written
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Mohamed Maamouri | Linguistic Data Consortium | US |
| Author 2 | Ann Bies | Linguistic Data Consortium, University of Pennsylvania | US |
| Author 3 | Seth Kulick | Linguistic Data Consortium | US |
| Author 4 | Michael Ciul | Linguistic Data Consortium | US |
| Author 5 | Nizar Habash | Center for Computational Learning Systems, Columbia University | US |
| Author 6 | Ramy Eskander | Center for Computational Learning Systems, Columbia University | US |
| Main Contact | Ann Bies | Linguistic Data Consortium, University of Pennsylvania | None |
Documentation:
<Not Specified>
Language Type:
Trilingual
Languages:
Egyptian Arabic English Mandarin Chinese
Availability:
The Data Will Be Published Via LDC General Catalogue
License:
<Not Specified>
Size:
1936987 words
Production Status:
Newly created-finished
Use:
Anaphora, Coreference
- Paper title: Large Multi-lingual, Multi-level and Multi-genre Annotation Corpus
- Paper track: Written
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Xuansong Li | Linguistic Data Consortium, University of Pennsylvania | US |
| Author 2 | Martha Palmer | Department of Linguistics and Computer Science, University of Colorado | US |
| Author 3 | Nianwen Xue | Computer Science Department, Brandeis University | US |
| Author 4 | Lance Ramshaw | Raytheon BBN Technologies | US |
| Author 5 | Mohamed Maamouri | Linguistic Data Consortium, University of Pennsylvania | US |
| Author 6 | Ann Bies | Linguistic Data Consortium, University of Pennsylvania | US |
| Author 7 | Kathryn Conger | Department of Linguistics and Computer Science, University of Colorado | US |
| Author 8 | Stephen Grimes | Linguistic Data Consortium, University of Pennsylvania | US |
| Author 9 | Stephanie Strassel | Linguistic Data Consortium, University of Pennsylvania | US |
| Main Contact | Xuansong Li | Linguistic Data Consortium, University of Pennsylvania | None |
Documentation:
<Not Specified>
Language Type:
Multilingual
Languages:
Egyptian Arabic English
Availability:
From Owner
License:
<Not Specified>
Size:
4.5 hours
Production Status:
Newly created-in progress
Use:
Speech Recognition/Understanding
- Paper title: Collection and Analysis of Code-switch Egyptian Arabic-English Speech Corpus
- Paper track: Speech
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Injy Hamed | The German University in Cairo | EG |
| Author 2 | Mohamed Elmahdy | The German University in Cairo | EG |
| Author 3 | Slim Abdennadher | The German University in Cairo | EG |
| Main Contact | Injy Hamed | The German University in Cairo | None |
Documentation:
<Not Specified>
Language Type:
Multilingual
Languages:
Arabic Egyptian Arabic
Availability:
From Owner
License:
<Not Specified>
Size:
<Not Specified>
Production Status:
Newly created-in progress
Use:
Document Classification, Text categorisation
- Paper title: Arabic Data Science Toolkit: An API for Arabic Language Feature Extraction
- Paper track: Written
- Paper status: Accept Poster+DemoSuggested
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Paul Rodrigues | University of Maryland | US |
| Author 2 | Valerie Novak | University of Maryland | US |
| Author 3 | C. Anton Rytting | University of Maryland College Park | US |
| Author 4 | Julie Yelle | University of Maryland | US |
| Author 5 | Jennifer Boutz | University of Maryland | US |
| Main Contact | C. Anton Rytting | University of Maryland College Park | None |
Documentation:
<Not Specified>
Language Type:
Trilingual
Languages:
Egyptian Arabic English Mandarin Chinese
Availability:
The Data Will Be Published Via LDC General Catalogue
License:
<Not Specified>
Size:
2344886 words
Production Status:
Newly created-finished
Use:
Semantic Role Labeling
- Paper title: Large Multi-lingual, Multi-level and Multi-genre Annotation Corpus
- Paper track: Written
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Xuansong Li | Linguistic Data Consortium, University of Pennsylvania | US |
| Author 2 | Martha Palmer | Department of Linguistics and Computer Science, University of Colorado | US |
| Author 3 | Nianwen Xue | Computer Science Department, Brandeis University | US |
| Author 4 | Lance Ramshaw | Raytheon BBN Technologies | US |
| Author 5 | Mohamed Maamouri | Linguistic Data Consortium, University of Pennsylvania | US |
| Author 6 | Ann Bies | Linguistic Data Consortium, University of Pennsylvania | US |
| Author 7 | Kathryn Conger | Department of Linguistics and Computer Science, University of Colorado | US |
| Author 8 | Stephen Grimes | Linguistic Data Consortium, University of Pennsylvania | US |
| Author 9 | Stephanie Strassel | Linguistic Data Consortium, University of Pennsylvania | US |
| Main Contact | Xuansong Li | Linguistic Data Consortium, University of Pennsylvania | None |
Documentation:
<Not Specified>




